distance metric
Learning semantic similarity in a continuous space
We address the problem of learning semantic representation of questions to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover's Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to travel to match the intent of another question. We first learn to repeat, reformulate questions to infer intents as normal distributions with a deep generative model [2] (variational auto encoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework. Among known models that can read sentences individually, our proposed framework achieves competitive results on Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from Information Theory.
- Asia > India > Karnataka > Bengaluru (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (5 more...)
- Research Report > Experimental Study (0.92)
- Research Report > New Finding (0.67)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Leisure & Entertainment (0.93)
- Education (0.67)
- North America > United States > California (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Israel (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Oceania > Australia (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- (2 more...)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.47)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Michigan (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- North America > Canada (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)